
Reviews: Full-Gradient Representation for Neural Network Visualization

Neural Information Processing Systems

Updates based on author feedback: Given that the authors added the digit-flipping experiments and obtained good results (albeit with different choices of post-processing), I am increasing my score to a 7. However, ***the increased score is based on good faith that the authors will add the following to their paper***: (1) I think it would be extremely helpful to practitioners if the authors exposed how different choices of post-processing affect the results. What happens to the digit-flipping experiments when sign information is discarded? What happens to Remove & Retrain when sign information is retained? Please be as up front as possible about the caveats; practitioners should be made aware that the choice of post-processing is something they need to pay close attention to, or the method may be used in a way that gives misleading results (as mentioned, we have seen this happen before with Guided Backprop).
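
To make the reviewer's concern concrete, here is a minimal sketch (not the authors' code; the array shapes and names are illustrative) of how the two post-processing choices in question, retaining versus discarding gradient sign, can produce saliency maps that rank pixels quite differently:

```python
# Minimal sketch: two common post-processing choices for a gradient-based
# saliency map. Keeping the sign preserves direction-of-effect; taking the
# absolute value discards it. This is exactly the kind of choice the review
# asks to be made explicit. The gradient tensor here is a random stand-in.
import numpy as np

rng = np.random.default_rng(0)
grad = rng.normal(size=(3, 32, 32))          # stand-in for an input gradient

signed_map = grad.sum(axis=0)                # sign retained, channels summed
unsigned_map = np.abs(grad).sum(axis=0)      # sign discarded before summing

# The two maps can order pixels very differently:
print(np.corrcoef(signed_map.ravel(), unsigned_map.ravel())[0, 1])
```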


Manipulating Sparse Double Descent

Zhang, Ya Shi

arXiv.org Artificial Intelligence

This paper investigates the double descent phenomenon in two-layer neural networks, focusing on the role of L1 regularization and representation dimensions. It explores an alternative double descent phenomenon, named 'sparse double descent'. The study emphasizes the complex relationship between model complexity, sparsity, and generalization, and suggests further research into more diverse models and datasets. The findings contribute to a deeper understanding of neural network training and optimization.
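
A hedged sketch of the setup the abstract describes: a two-layer network whose sparsity is controlled by an L1 penalty, the knob one would sweep to trace out a sparse double descent curve. The layer sizes and penalty strength `l1_lambda` are illustrative, not taken from the paper.

```python
# Two-layer network trained with an L1 penalty on the weights; sweeping
# l1_lambda (and hence sparsity) is the axis along which sparse double
# descent would be measured. Data here is random for self-containment.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(100, 512), nn.ReLU(), nn.Linear(512, 10))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
l1_lambda = 1e-4

x, y = torch.randn(64, 100), torch.randint(0, 10, (64,))
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss = loss + l1_lambda * sum(p.abs().sum() for p in model.parameters())
    loss.backward()
    opt.step()
```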


Efficient Adversarial Training with Robust Early-Bird Tickets

Xi, Zhiheng, Zheng, Rui, Gui, Tao, Zhang, Qi, Huang, Xuanjing

arXiv.org Artificial Intelligence

Adversarial training is one of the most powerful methods to improve the robustness of pre-trained language models (PLMs). However, this approach is typically more expensive than traditional fine-tuning because of the necessity to generate adversarial examples via gradient descent. Delving into the optimization process of adversarial training, we find that robust connectivity patterns emerge in the early training phase (typically $0.15\sim0.3$ epochs), far before parameters converge. Inspired by this finding, we extract robust early-bird tickets (i.e., subnetworks) to develop an efficient adversarial training method: (1) searching for robust tickets with structured sparsity in the early stage; (2) fine-tuning robust tickets in the remaining time. To extract the robust tickets as early as possible, we design a ticket convergence metric to automatically terminate the searching process. Experiments show that the proposed efficient adversarial training method can achieve up to $7\times\sim13\times$ training speedups while maintaining comparable or even better robustness compared to the most competitive state-of-the-art adversarial training methods.
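
One plausible shape for the ticket convergence check the abstract mentions (the actual metric is defined in the paper; `keep_ratio` and `eps` here are illustrative) is to compare the pruning masks drawn at consecutive search epochs and stop searching once they stop changing:

```python
# Hedged sketch: stop the ticket-search phase once consecutive pruning
# masks agree, measured by normalized Hamming distance. This is in the
# spirit of early-bird ticket detection, not the paper's exact metric.
import numpy as np

def mask_from_scores(scores: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Keep the top `keep_ratio` fraction of weights by importance score."""
    k = max(1, int(scores.size * keep_ratio))
    thresh = np.partition(scores.ravel(), -k)[-k]
    return scores >= thresh

def masks_converged(prev: np.ndarray, curr: np.ndarray, eps: float = 0.01) -> bool:
    """True once the fraction of entries that changed drops below eps."""
    return np.mean(prev != curr) < eps
```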


EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets

Chen, Xiaohan, Cheng, Yu, Wang, Shuohang, Gan, Zhe, Wang, Zhangyang, Liu, Jingjing

arXiv.org Artificial Intelligence

Deep, heavily overparameterized language models such as BERT, XLNet and T5 have achieved impressive success in many natural language processing (NLP) tasks. However, their high model complexity requires enormous computation resources and extremely long training time for both pre-training and fine-tuning. Many works have studied model compression on large NLP models, but they focus only on reducing inference time while still requiring an expensive training process. Other works use extremely large batch sizes to shorten the pre-training time, at the expense of higher computational resource demands. In this paper, inspired by the Early-Bird Lottery Tickets recently studied for computer vision tasks, we propose EarlyBERT, a general computationally-efficient training algorithm applicable to both pre-training and fine-tuning of large-scale language models. By slimming the self-attention and fully-connected sub-layers inside a transformer, we are the first to identify structured winning tickets in the early stage of BERT training. We apply those tickets towards efficient BERT training, and conduct comprehensive pre-training and fine-tuning experiments on GLUE and SQuAD downstream tasks. Our results show that EarlyBERT achieves comparable performance to standard BERT, with 35~45% less training time.
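
A hedged sketch of the "slimming" idea the abstract describes: attach learnable scalar coefficients to each self-attention head, regularize them with an L1 penalty during the short search phase, then keep only the heads whose coefficients survive. Shapes, names, and the number of heads kept are illustrative, not from the paper.

```python
# One gate per attention head; an L1 penalty on the gates drives some of
# them toward zero during the search phase, yielding a structured ticket.
import torch
import torch.nn as nn

num_heads, head_dim = 12, 64
head_coeff = nn.Parameter(torch.ones(num_heads))     # learnable head gates

def gated_heads(head_outputs: torch.Tensor) -> torch.Tensor:
    # head_outputs: (batch, num_heads, seq_len, head_dim)
    return head_outputs * head_coeff.view(1, -1, 1, 1)

def slimming_penalty(l1_lambda: float = 1e-4) -> torch.Tensor:
    return l1_lambda * head_coeff.abs().sum()        # added to the task loss

out = gated_heads(torch.randn(2, num_heads, 16, head_dim))
# After the search phase, the structured ticket keeps the top-k heads:
keep = torch.topk(head_coeff.abs(), k=8).indices
```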


Logarithmic Pruning is All You Need

Orseau, Laurent, Hutter, Marcus, Rivasplata, Omar

arXiv.org Machine Learning

The Lottery Ticket Hypothesis is a conjecture that every large neural network contains a subnetwork that, when trained in isolation, achieves comparable performance to the large network. An even stronger conjecture has been proven recently: every sufficiently overparameterized network contains a subnetwork that, at random initialization, but without training, achieves comparable accuracy to the trained large network. This latter result, however, relies on a number of strong assumptions and guarantees only a polynomial factor on the size of the large network compared to the target function. In this work, we remove the most limiting assumptions of this previous work while providing significantly tighter bounds: the overparameterized network only needs a number of neurons per weight of the target subnetwork that is logarithmic in all variables but depth.
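
To make the two statements in the abstract concrete, here is one schematic formalization; the notation is mine, not the paper's. $F$ is the target network, $G$ the large random network, and $G_S$ the subnetwork induced by keeping only the weights in a mask $S$:

```latex
% Schematic reading of the abstract (notation is illustrative, not the
% paper's). Lottery ticket: some subnetwork trains to comparable accuracy.
\[
  \exists\, S \subseteq \mathrm{weights}(G):\quad
  \mathrm{acc}\big(\mathrm{train}(G_S)\big) \approx \mathrm{acc}\big(\mathrm{train}(G)\big)
\]
% Stronger form: with G at random initialization and no training, some
% subnetwork already approximates the target network F to accuracy eps.
\[
  \exists\, S:\quad
  \sup_{x}\, \big\| G_S(x) - F(x) \big\| \le \epsilon
\]
```

On this reading, the contribution is that the required width of $G$ scales only logarithmically (in all variables but depth) per weight of $F$, rather than polynomially as in the earlier result.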


Just One View: Invariances in Inferotemporal Cell Tuning

Riesenhuber, Maximilian, Poggio, Tomaso

Neural Information Processing Systems

In macaque inferotemporal cortex (IT), neurons have been found to respond selectively to complex shapes while showing broad tuning ("invariance") with respect to stimulus transformations such as translation and scale changes and a limited tuning to rotation in depth.

